What is Markdown and why use it?

Markdown is a flexible and lightweight tool for document writing. This tutorial page, for example, was written using Markdown. It is easy to has an easy-to-write plain text format, and some basic syntax for formatting. You can add in LaTeX or HTML to Markdown to further adjust formatting, making it very flexible.

There are several flavours and extensions to Markdown and which you use depends on your context. For this workshop we will be using Quarto, which is a new (2022) piece of software built on R Markdown. Quarto is a tool for making dynamic documents, which combines markdown, and sections or chunks of embedded programming code in languages like R and Python. This is powerful as is allows you to write reports or presentations that contain your R/Python code.

Basic Markdown syntax is the same across all extensions and flavours. Two good sources of picking up the basic syntax are the markdown guide and the markdown tutorial. You can use these as points of reference throughout this tutorial.

There are three main reasons for writing reports or presentations in Quarto (Markdown) instead of LaTeX:

There are very few disadvantages to using Quarto over just LaTeX. One disadvantage is, Quarto needs an environment to work in like RStudio, VS Code, or Jupyter notebooks/lab which you might not be familiar with.

Image about R Markdown, the predecessor to Quarto | Artwork by @allison_horst
Image about R Markdown, the predecessor to Quarto | Artwork by @allison_horst

How are we going to use Quarto

For this workshop we are going to use Quarto in RStudio as this is one of the easiest environments to use Quarto from, and some sections of this workshop have optional R code.

If preferred you are welcome to use VS Code or Jupyter Notebook, but how to use them with Quarto will not detailed in the tutorial, so ask a trainer for help.

Outcomes

By the end of this workshop you will:

It might feel like there is a lot to explore. However, we split it into separate sections to make it easier for you to work through. We hope you enjoy it!

Requirements and setup

For this workshop you will need the following software and packages.

Software

Make sure you have R and RStudio installed. If you haven’t, first install R and then RStudio.

Packages

As we are using Quarto as our markdown tool, there are a few steps to go through to get it up and running.

We will need to install Quarto, select the installer which matches your device (Windows or Mac). Then follow the installation instructions. If installing on the library computers, just install it for your own profile

Next you will need to install tinytex. Depending on if you are on a Mac or a Windows the series of images below will guide you through this.

Windows

The first step on a windows machine is to search for terminal or command prompt in the search bar at the bottom right of your screen. Then open command prompt.

Once the command prompt is open, type in or copy the following command: quarto install tinytex

An install programme should run, when completed you should see a message saying installation complete (or similar). This means you’ll be ready to go to the next step.


Mac

The first step on a mac is to select the go menu on the top bar -> then utilities (or use shift+command+U)

When in the utilities screen, select the terminal to open it.

Once the terminal is open, type the following command in install tinytex: quarto install tinytex

An install programme should run, when completed you should see a message saying installation complete (or similar). This means you’ll be ready to go to the next step.


Once you have installed Quarto and tinytex, open RStudio run the below command to make sure the package is installed and loaded.

install.packages("quarto")
library(quarto)

We are now up and ready to write some pdf reports!

Introduction to Quarto

When you open a Quarto document, which has the extension of .qmd you will see an interface like this.

The grey areas are the code chunks. To run code cells you click on the green play button or use control/command + enter

The white space is where we write our text. We can use markdown formatting here.

The blue # are headers. One # means top header, two # means second header and so on. We use these to make sections in our document.

The top part (the one that starts with title and ends with output between —) is written in a so-called YAML language. Here we tell Quarto what output we want, who we are and so on; this creates metadata that is used to compile your documents.

Useful resources to use during this session

Extra resources for after the session

Exercises

You recently converted a cool article about the gender pay gap in the UK to LaTeX, which was really fun!

The author of the article was feeling generous and also sent you the code and data used to make the figures and tables in the report. This made you wonder ‘can I do this report all in one with the code and text?’…then you remember that with Quarto you can do just this! You decide it would be fun to convert the same article but this time use Quarto to see the difference. Anyway, learning new skills is fun!

Feeling organised you collate everything you need into a folder which contains:

Click link to access the files

Task 0 — load the template into RStudio

  1. Download all the files to your computer
  2. Unzip/extract the files you have downloaded. On a Mac you double click the file to unzip, on a Windows you right-click -> extract all
  3. In RStudio go to File -> Open File and open the report_template.qmd file.

You’ll notice a few things. First is the section at the top that has three dashes and some text like output, author etc. This is the YAML (YAML Ain’t Markup Language) header.

YAML in markdown is like the preamble section in LaTeX where we define what our document will look like. There are a lot of options and changes we can make. For now use this template and play around with it later!

Before going onto the next task, press the render button (or press shift+command/control+k) to compile your document so you can see what it looks like.

Task 1 — title page

We have a document template, first thing we want to do is to build up that title page!

Using the gender_pay_gap.pdf file as your example:

  • Add the title
  • Add the authors
  • Add the abstract

Hint: everything you change here is in the YAML header.

Note: the date will find today’s date for you, which is handy. If you wanted to change it manually you can also do that.

Task 2 — contents

Our title page is looking good!

Next we set up the contents page which should include a table of contents, list of figures, and list of tables. We might not have added tables or figures yet, but we will soon.

  • Add a table of contents
  • Add a list of figures
  • Add a list of tables
  • Make sure your contents page are on a separate page to the rest of the document

Do not hesitate to search for ways to do it online!

Hint: To match the document exactly you’ll need to use LaTeX, adding the LaTeX code after the YAML header (the three dashes at the top of the page). If you get stuck check out the LaTeX documentation via Overleaf. You can also do this with Quarto, and it is explained in their documentation, LaTeX is simpler though in this case.

Task 3 — organise your files

For the next tasks we will need to use the resources the author sent us.

I recommend the following:

  1. Create a new folder/directory for this workshop
  2. Add your report_template.qmd file and all the rest of the resources files there
  3. You should see the files in the file explorer (bottom right window in Rstudio, Files tab)
  4. You might get a message about file being moved or deleted. Don’t worry, just open report_template.qmd again and you’re good to go

When you can see the files in your RStudio file viewer, click on one of them like the pay_gap_bot.png file to open it. If it opens, you’re good to move on!

Task 4 — introduction

We are so organised! Now we can start adding the real content to the document.

  • Copy introduction text from the pdf into RStudio (make sure to paste in the introduction section)
  • Add the links. Click on the links to get the urls
  • Add the figure, which is the pay_gap_bot.png image file in your project
  • Make sure you’ve added a caption and your image is in the centre of the page

Hint: You can use markdown syntax to add the image and caption. The caption goes within the [].

Note: that to make the image the same as the example you will need to add sizing options to your syntax with {} with the option like {out.width = '70%'}. See the information on figures in the Quarto documentation.

A slight diversion to talk about Execution Options

This section is strictly extra information which might be helpful in your future document writing endeavours. Feel free to skip over it and move on to task 5.

Quarto works as a combination of plain text, markdown, YAML, and code chunks. Code chunks are anything that has two sets of the three ```. This is true for all markdown types and is usually shown as a different colour such as grey. The cool thing about Quarto is that is allows you to run these code chunks interactivity. To insert a new code chunk in RStudio use either Ctrl+Alt+I for Windows or Cmd+Option+I for Mac.

The syntax for a code chunk is broken down as follows:

  1. They are wrapped with ``` above and below. You will see they are a different colour to the rest of the document
  2. After the first ``` you have curly brackets {} which contain the following:
    • Which programming language you are using, which in this case is r. To run code, you need to specify the programming language used
    • You can add a name of the code chunk which helps for referencing later
    • Then you have options like do you want to show the code and output, or just the output, or change the size of a output and so on
  3. Then you have the actual code
  4. Finally you close the code chunk with another ```

Below are two examples where we use some R code to make a basic figure. The first has no execution options, the second does. Without execution options we see everything. This is useful for showing collaborators, colleagues, and friends what you’ve done, but it is less good when you want a tidy report. In this example, we see all the code, and the output.

set.seed(19)
x <- runif(100)
y <- runif(100)
plot(
  x = x, y = y,
  col = ifelse(y > 0.6 | x < 0.3, '#2BB3F1','#F26B2C'),
  pch = 19,
  main = "Example plot"
)

With execution options we can see how we can make the output more presentable if we were to use this in a report. We have added a caption, hidden the code, changed the size, and centred the output.
my interesting caption

my interesting caption

The options used are listed below, which have to go within the curly brackets.

  • I would first give the code chunk a name. Usually this is the first thing you add after the r. It will look something like {r my-plot}
  • Next, we can set up some figure parameters. There are lots of options, that start with the prefix fig.. We’ve used align and cap. It looks like {fig.align="center" , fig.cap="my interesting caption"}
  • Then we have changed the output, which starts with the out. prefix. We have used the width parameter which looks like {out.width="50%"}
  • Finally we hide the code using the echo parameter, which takes true or false, and it looks like {echo=FALSE}

You end up with chunk options that look like: {r my-plot, fig.align="center", fig.cap="my interesting caption", out.width="50%", echo=FALSE}

Some side notes on the side note is that there are other ways to define chunk options, which are well documented in the Quarto documentation. This primarily includes having a line at the top your code chunk that looks like #| fig.align="center.

Task 5 — methods

If you have not done so already render your document again so we can see the changes we made! Remember we press the render button (or press shift+command/control+k) to compile our document with Quarto. You should see that the list of figures has appeared with an entry, yay!

The methods section has more new elements for us to try out such as equations and citations.

  • Make a new section called methods, like we have for introduction
  • Add a new page between methods and introduction
  • Copy the text from the methods in the pdf into your Quarto document
  • Add the url links
  • Write the equations
  • Add the references, all of which are in the references.bib file

Equation hint 1: you can use LaTeX or markdown syntax to write equations.

Equation hint 2: There are various shortcuts available to make your equation writing easier. If I wanted to write the below equation I’d need to do $$ {2 \over 4} \times 5 $$. \[{2 \over 4} \times 5\]

Reference hint 1: there are two ways to cite in Quarto using either @author or [@author].

Reference hint 2: each reference in the references.bib file has a label which you use within the cite command like @ggtext.


General notes about references

The LSE Library team offers a lot of support and advice on citing and referencing your own work. Information on this support can be found on the library web page.

The template, rather helpfully, has been set up to handle references and citations. When using Quarto, to use references you just need to add the following two lines in the YAML header.

bibliography: "references.bib"
biblio-style: apsr

Note that the command biblio-style: apsr tells Markdown what referencing style to use, which in this case is the American Political Science Review style. There are many possibilities, so before choosing a style for your own work, you should check with your department or librarian. A full list of bibliography styles is available on this website.

As soon as you add in a reference, the references page will appear at the bottom of your document. You can change the referencing style to whatever you need and the R Markdown book or Quarto documentation provide a guide to this.

Task 7 — results

Now our links are sorted, we can do the results section which has some new elements for us to think about including a table, footnotes and figure referencing.

  • Make a new section called results on a separate page from methods
  • Copy the text from the results page in the pdf to your Quarto document
  • Add the footnote about tokenisation
  • Add the two url links
  • The copied table doesn’t work! You think writing tables from scratch in markdown seems like a lot of effort. After a Google search you find a table convert tool which will do the hard work for you, great! Upload the paygap_sector_average.csv to the table converter to get your table
  • Once your table is in your document make sure it has a caption; this documentation page should help
  • Add the figure, which is the paygap.png image file. Be sure to pay attention to the positioning of the figure, you might need to try different positions till it looks right
  • Make sure your figure has a label
  • Now you can add the figure labels using LaTeX code like \ref{fig:chunk-label} or using Markdown syntax

Figure hint 1: the figure label in Quarto will be the code chunk label if you use the knitr::include_graphics method.

Figure hint 2: figure position is determined by a parameter in the chunk options, for example: {r, chunk-label, out.width="80%, fig.pos = "p"}

Table diversion

This section is, again, strictly extra information which might be helpful in your future document writing endeavours. Feel free to skip over it and move on to task 8.

We have gone for the most basic and simple option here when adding a table, but there are many other options at our disposal because we are using Quarto. We’ve listed these options below for you to look at in your own time.

  • A nice way to present statistical data, like regression outputs, is the stargazer package
  • To make tables with code there are two really good options:
    • kableExtra which works well with pdf and HTML outputs, and is an extension of the kable function that comes with the knitr package
    • gt and the extensions to gt called gtExtras and gtsummary can make very fancy tables but work best with HTML outputs

I’ve added two examples of what we can do using kableExtra, all code examples are based on the examples from the kableExtra documentation.

This first example brings in the csv file we used for the table, then we display it using kableExtra.

# load in kableExtra
library(kableExtra)

# load in csv
sector_avg <- read.csv("paygap_sector_averages.csv")

# make table with kableExtra
kbl(x = sector_avg, booktabs = T, 
    col.names = c("Sector", "% increase in mens wages compared to womens"),
    linesep = " ", align = "c",
    caption = "Table to show the sectors with, on average, the largest 
    percent gap in men’s wages compared to women’s wages.") %>% 
  kable_styling(latex_options = "striped") %>%
  column_spec(column = 2, 
              color = spec_color(sector_avg$percent_increase_in_mens_wages_compared_to_womens, 
                                 option = "A", 
                                 end = 0.8))
Table to show the sectors with, on average, the largest percent gap in men’s wages compared to women’s wages.
Sector % increase in mens wages compared to womens
primary 0.2835
secondary 0.2494
education 0.2383
financial 0.2112
construction 0.1978
technology 0.1932
information 0.1892
head 0.1640
offices 0.1640
technical 0.1615

This second example uses the code from the sector_averages.R script, then we display the output using kableExtra. We have sourced the code, which just means we have loaded in the script which does the analysis. I’ve included the code here but you can use chunk options to stop the code showing, such as using echo=FALSE. I’ve also included message=FALSE and warning=FALSE in as chunk options so we don’t get loading messages and such. You’ll be able to see the sector_averages.R script in your resources file.

# load in kableExtra
library(kableExtra)

# load in analysis script
source("resources/analysis/sector_averages.R")

# just the top 10
paygap_top <- paygap_final %>% head(n = 10)

# make kableExtra table
kbl(x = paygap_top, booktabs = T, 
    col.names = c("Sector", "% increase in mens wages compared to womens"),
    linesep = " ", align = "c",
    caption = "Table to show the sectors with, on average, the largest 
    percent gap in men’s wages compared to women’s wages.") %>% 
  kable_styling(latex_options = "striped") %>%
  column_spec(column = 2, 
              color = spec_color(paygap_top$percent_increase_in_mens_wages_compared_to_womens, 
                                 option = "A", 
                                 end = 0.8))
Table to show the sectors with, on average, the largest percent gap in men’s wages compared to women’s wages.
Sector % increase in mens wages compared to womens
primary 0.2835
secondary 0.2494
education 0.2383
financial 0.2112
construction 0.1978
technology 0.1932
information 0.1892
head 0.1640
offices 0.1640
technical 0.1615

I should also note that the above idea of including your scripts in your document also applies to the paygap.png image we added which is generated from the paygap.R script.

Task 8 — discussion

We are almost there, just the discussion left. Two new features here are a quote and a list.

  • Make a new section called discussion on a separate page from results
  • Copy the text from the discussion page in the pdf to your Quarto document
  • Format the quote properly using our recommended resources for help
  • Add the footnote
  • Format the list using our recommended resources for help

Wow, we’ve just taken a pdf document and converted it to Quarto!

Word count tip

Last but not least you might be interested in either adding in or calculating your word count. This is not the most straightforward in Quarto, but there is a nice workaround.

Copy and paste this code and add it into your document. Make sure it is not in a code cell. This is known as inline code and produces the following: Word count: 4723. You will likely get a different word count!

**Word count**: `r as.integer(sub("(\\d+).+$","\\1",system(sprintf("wc -w %s", knitr::current_input()),intern=TRUE)))-20`

There is also a new(ish) Quarto extension which does word counting.

What’s next?

Have a go at the take home challenge, or set your own challenge like converting a report or document you’ve previously written into Quarto/R Markdown or LaTeX. You can try and use Quarto or LaTeX for reports you write this year which is great opportunity to learn.

You also might wonder if there are other options than LaTeX and Quarto. The great news is Quarto is designed to be multi-lingual, so you can do what we did today but instead use Quarto with Jupyter Notebooks or VS Code instead, using Python code for any analysis you want to add.

These days you can also write Python as well as other code in RStudio, for example to use Python you use the reticulate package. There are also other options available to code in c++, JavaScript and more.

We should also mention that Jupyter Notebooks has the ability to write publication ready reports using the Jupyter book extension. If you generally prefer Jupyter Notebooks then it is worth having a look, but from my experience it is less user-friendly and flexible than R Markdown/Quarto. I’d recommend using Quarto with Jupyter Notebooks from within Visual Studio Code (VS Code) to get the best possible experience. Check out the VS Code Jupyter extension and how to use it with Quarto; you can also do this with R and Julia.

If you want to use Markdown as a collaborative tool, we recommend HackMD, which allows several users at once to edit documents in a similar way you can with Microsoft Word via SharePoint, or a Google Doc. It has a useful tutorial and all the nice Markdown features you would need.

Take home challenge

A fun and useful challenge you can set yourself is to write your CV using Markdown! CV writing with R Markdown, the predecessor to Quarto, is well developed and has a lot of excellent examples to pull from. Quarto, as it is newer, has less development but it is still just as straightforward.

Using Quarto to write a CV

Write your CV with Quarto using the quarto cv extension. This provides Quarto template documents, which you can edit and adjust to your own needs. It is a recent development, but I am sure more templates and examples will appear in the coming months and years.

Using R Markdown to write a CV

Write your CV using R Markdown with the vitae package! This is a very convenient way of writing and keeping your CV up to date as it is data driven. What this basically means is you keep all the information about your work, skills, experiences and so on in an Excel file (or Google sheets), which you load into R Markdown to create a cool CV.

There are lots of good examples on the vitae GitHub page, but two I’d recommend are from Lorena Abad and Shazia Ruybal-Pesántez.

My suggestion on getting started is the following:

  1. Install vitae with install.packages('vitae')
  2. Instead of using the examples with the package, use Shazia Ruybal-Pesántez’s CV as your template. To do this, click on Code (green button) -> Download ZIP which will download the files for you
  3. In the data folder you will find cv_data.xlsx. You can change this so it has your experience and skills in it
  4. Load up the cv-vitae.Rmd file and change what you need, regularly compiling (knitting) to see what it looks like
  5. To make your CV exciting this is where Lorena Abad’s CV comes in. She wrote some custom formatting options in the awesome-cv.cls file. You can copy the contents of the file on her GitHub and paste it into your local copy of the awesome-cv.cls file. Knit your report to see the difference.
  6. Now have a look through her rmd file. She has icons with the headers which is a nice touch such as a briefcase with her work experience which looks like: \faIcon{briefcase} Professional Experience. Try and add some of these into your CV. hint: You’ll likely need the fontawesome package for this: https://github.com/rstudio/fontawesome